The Shape of Digits

A Bayesian Topological Data Analytic Approach to Classification of Handwritten Digits

Thomas Reinke

Baylor University

Theophilus A. Bediako

Baylor University

August 14, 2025

Contents

  1. MNIST EDA
  2. Traditional ML
  3. Proposed Methodology
  4. Analysis
  5. TDA + ML
  6. Results/Future Work
  7. References

Exploratory Data Analysis

Distribution of training labels

Pixel Intensity

Training Data tSNE Visualization

Traditional ML

Neural networks

Feedforward neural network with structure:

  • Input layer: Consists of neurons that receive the input data; each neuron represents one feature (pixel) of the input
  • Hidden layer: One or more hidden layers placed between the input and output layers, responsible for capturing complex patterns
  • Output layer: Final output of the network; the number of neurons equals the number of digit classes
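
The layered structure above can be sketched as a single forward pass. This is a minimal illustration, not the fitted model: the hidden width (128) and random weights are placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes: 784 input pixels (28x28), 128 hidden units, 10 digit classes
W1 = rng.normal(0, 0.05, (784, 128))
b1 = np.zeros(128)
W2 = rng.normal(0, 0.05, (128, 10))
b2 = np.zeros(10)

def forward(x):
    """One forward pass: input -> hidden (ReLU) -> output (softmax)."""
    h = np.maximum(0, x @ W1 + b1)       # hidden layer captures nonlinear patterns
    z = h @ W2 + b2                      # output layer: one score per digit
    z -= z.max()                         # subtract max for numerical stability
    return np.exp(z) / np.exp(z).sum()   # softmax: class probabilities

probs = forward(rng.random(784))         # probabilities over the 10 digits
```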

NN with regularization

  • Depending on the model, the number of network weights can exceed the size of the training data
    • This leads to overfitting
  • We considered two approaches to combat overfitting:
    • Dropout learning: like random forests, randomly removes a fraction of units in each layer during model fitting
    • Regularization: imposes penalties on the parameters, e.g. lasso or ridge
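
The two approaches can be illustrated in a few lines; the dropout rate and penalty weight below are arbitrary placeholders, not the tuned values.

```python
import numpy as np

rng = np.random.default_rng(1)
h = rng.random(128)               # hidden-layer activations
W = rng.normal(size=(128, 10))    # weights feeding the output layer

# Dropout: randomly zero a fraction of units during training,
# rescaling survivors so the expected activation is unchanged
rate = 0.5
mask = rng.random(h.shape) >= rate
h_dropped = h * mask / (1 - rate)

# Ridge (L2) regularization: add a weight penalty to the training loss
lam = 1e-3
l2_penalty = lam * np.sum(W ** 2)
# Lasso (L1) would instead use: lam * np.sum(np.abs(W))
```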

Specific NN models considered

  • NN with dropout regularization
  • NN with ridge regularization
  • NN with lasso regularization

Multinomial logistic regression

  • Multinomial logistic regression equivalently represented by NN with no hidden layers
  • Output layer with softmax
    • \(f_m(X) = Pr(Y = m \mid X) = \frac{e^{Z_m}}{\sum\limits_{l \in K} e^{Z_l}}\)
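
The equivalence can be seen directly: with no hidden layer, the network computes exactly the multinomial-logit probabilities. The weights below are random placeholders.

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.random((5, 784))             # 5 flattened 28x28 images
W = rng.normal(0, 0.01, (784, 10))   # one weight vector per class m
b = np.zeros(10)

Z = X @ W + b                        # class scores Z_m (no hidden layer)
Z -= Z.max(axis=1, keepdims=True)    # stabilize before exponentiating
P = np.exp(Z) / np.exp(Z).sum(axis=1, keepdims=True)  # Pr(Y = m | X)
```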

NN Fitting

  • Train the network for 30 epochs with a batch size of 128
    • Images are presented in batches of 128, and SGD updates the weights after each batch
  • Each epoch processes all 60,000 training images
  • A classification is correct if the largest output value matches the target label
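
The training loop described above amounts to the following sketch. Tiny synthetic data stands in for the 60,000 MNIST images, the model is a bare softmax classifier, and the epoch count and learning rate are shrunk placeholders.

```python
import numpy as np

rng = np.random.default_rng(3)
n, d, k = 512, 784, 10                  # small stand-in for 60,000 images
X = rng.random((n, d))
y = rng.integers(0, k, n)
W = np.zeros((d, k))

epochs, batch_size, lr = 3, 128, 0.1    # the talk uses 30 epochs, batch size 128
for epoch in range(epochs):
    order = rng.permutation(n)          # each epoch sees every image once
    for start in range(0, n, batch_size):
        idx = order[start:start + batch_size]
        Z = X[idx] @ W
        Z -= Z.max(axis=1, keepdims=True)
        P = np.exp(Z) / np.exp(Z).sum(axis=1, keepdims=True)
        P[np.arange(len(idx)), y[idx]] -= 1      # softmax cross-entropy gradient
        W -= lr * X[idx].T @ P / len(idx)        # SGD update after each batch

pred = (X @ W).argmax(axis=1)           # correct if largest output matches label
acc = (pred == y).mean()
```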

Proposed Methodology

TDA Workflow

Based on framework by (Maroulas, Nasrin, and Oballe 2020)

Analysis

Confusion Matrix Heatmap - NN dropout

Confusion Matrix Heatmap - NN ridge

Confusion Matrix Heatmap - Multinomial

Proposed Method Analysis


  method        accuracy
  -----------   --------
  multinomial     0.9856
  dropout NN      0.9962
  ridge NN        0.9946
  lasso NN        0.9946
  proposed        0.2023

TDA + ML

Extension of (Garin and Tauzin 2019)

Filtering

  • Grayscale: The most natural filtration, using the image’s original pixel intensities directly. A pixel enters the complex once its intensity passes a growing threshold.
  • Height: Inspired by Morse theory, this filtration assigns each pixel the value of its projection onto a chosen direction vector, essentially measuring its “height” from that angle. We use 8 different directions to capture features from multiple perspectives.
  • Radial: Assigns each pixel a value based on its distance from a chosen center point. We use 9 different centers to capture features originating from various locations.
  • Dilation: Assigns each pixel its shortest distance to a foreground (value = 1) pixel, which has the effect of “growing” or “dilating” the digit.
  • Erosion: The inverse of dilation, obtained by applying the dilation filtration to the inverted image; this “shrinks” or “erodes” the digit.
  • Density: Assigns each pixel the number of foreground neighbors within a given radius. We use radii of 2, 4, and 6 to capture line thickness at different scales.
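
As a concrete instance, the radial filtration can be sketched as follows; the toy 5×5 image and single center are illustrative (our pipeline uses 9 centers on the 28×28 grid).

```python
import numpy as np

# Toy binary "digit": 1 = foreground pixel
img = np.array([[0, 1, 1, 1, 0],
                [0, 1, 0, 1, 0],
                [0, 1, 1, 1, 0],
                [0, 0, 0, 1, 0],
                [0, 1, 1, 1, 0]])

def radial_filtration(image, center):
    """Assign each foreground pixel its Euclidean distance to `center`;
    background pixels get +inf so they enter the filtration last."""
    rows, cols = np.indices(image.shape)
    dist = np.sqrt((rows - center[0]) ** 2 + (cols - center[1]) ** 2)
    return np.where(image == 1, dist, np.inf)

filt = radial_filtration(img, center=(2, 2))
```

Sublevel sets of `filt` then grow outward from the center, and persistent homology tracks when loops and components appear and disappear.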

Analysis


Results/Future Work

ML + TDA Results

References

Garin, Adélie, and Guillaume Tauzin. 2019. “A Topological "Reading" Lesson: Classification of MNIST Using TDA.” CoRR abs/1910.08345. http://arxiv.org/abs/1910.08345.
Maroulas, Vasileios, Farzana Nasrin, and Christopher Oballe. 2020. “A Bayesian Framework for Persistent Homology.” SIAM J. Math. Data Sci. 2 (1): 48–74.